23 research outputs found

    How valid can data fusion be?

    "Data fusion techniques typically aim to achieve a complete data file from different sources which do not contain the same units. Traditionally, this is done on the basis of variables common to all files. It is well known that those approaches establish conditional independence of the specific variables given the common variables, although they may be conditionally dependent in reality. We discuss the objectives of data fusion in the light of their feasibility and distinguish four levels of validity that a fusion technique may achieve. For a rather general situation, we derive the feasible set of correlation matrices for the variables not jointly observed and suggest a new quality index for data fusion. Finally, we present a suitable and efficient multiple imputation procedure to make use of auxiliary information and to overcome the conditional independence assumption." (Author's abstract, IAB-Doku) Keywords: data fusion, data preparation, data quality, correlation, validity, applied statistics, mathematical statistics, imputation methods

    How valid can data fusion be

    "Data fusion techniques typically aim to achieve a complete data file from different sources which do not contain the same units. Traditionally, this is done on the basis of variables common to all files. It is well known that those approaches establish conditional independence of the specific variables given the common variables, although they may be conditionally dependent in reality. We discuss the objectives of data fusion in the light of their feasibility and distinguish four levels of validity that a fusion technique may achieve. For a rather general situation, we derive the feasible set of correlation matrices for the variables not jointly observed and suggest a new quality index for data fusion. Finally, we present a suitable and efficient multiple imputation procedure to make use of auxiliary information and to overcome the conditional independence assumption." (Author's abstract)

    MI Double Feature: Multiple Imputation to Address Nonresponse and Rounding Errors in Income Questions

    Obtaining reliable income information in surveys is difficult for two reasons. On the one hand, many survey respondents consider income to be sensitive information and thus are reluctant to answer questions regarding their income. If those survey participants who do not provide information on their income are systematically different from the respondents - and there is ample research indicating that they are - results based only on the observed income values will be misleading. On the other hand, respondents tend to round their income. This second source of error in particular is usually ignored when analyzing the income information. In a recent paper, Drechsler and Kiesl (2014) illustrated that inferences based on the collected information can be biased if the rounding is ignored and suggested a multiple imputation strategy to account for the rounding in reported income. In this paper we extend their approach to also address the nonresponse problem. We illustrate the approach using the household income variable from the German panel study "Labor Market and Social Security".

    Codebook and documentation of the panel study 'Labour Market and Social Security' (PASS) : Volume I: Introduction and overview. Wave 2 (2007/2008)

    "The panel study 'Labour Market and Social Security' (PASS), established by the Institute for Employment Research (IAB), is a new dataset for labour market, welfare state and poverty research in Germany, creating a new empirical basis for the scientific community and for policy advice. This Datenreport provides an overview of the second survey wave, for which 12,487 individuals were interviewed in 8,429 households between December 2007 and July 2008. 10,114 individuals and 7,342 households were interviewed for the second time in the context of PASS. The spectrum of questions and the design of PASS are intended to close gaps in the existing stock of data. PASS has three main characteristics that extend analysis potential beyond that of the Federal Employment Agency's administrative data: 1. The panel takes the household context into account - including the situation before and after receipt of Unemployment Benefit II. 2. The panel is complete in that it covers all groups of persons and all employment biographies, not only people in dependent employment, unemployed people and those in need of assistance. The dataset also provides information on the status during phases of economic inactivity, self-employment or employment as civil servants. 3. The panel collects additional or significantly more detailed data on relevant characteristics such as attitudes, employment potential or job-search behaviour." (Author's abstract, IAB-Doku) Additional information: questionnaires of the second wave; the German version; further information about the panel study "Labour Market and Social Security". Keywords: IAB household panel, data collection, survey method, sample, panel method, data preparation

    Codebuch und Dokumentation des 'Panel Arbeitsmarkt und soziale Sicherung' (PASS) : Welle 2 (2007/2008)

    "The panel study 'Labour Market and Social Security' (PASS), established by the Institute for Employment Research (IAB), is a new dataset for labour market, welfare state and poverty research in Germany, creating a new empirical basis for the scientific community and for policy advice. This "Datenreport" written in German provides an overview of the second survey wave, for which 12,487 individuals were interviewed in 8,429 households between December 2007 and July 2008. 10,114 individuals and 7,342 households were interviewed for the second time in the context of PASS. The spectrum of questions and the design of PASS are intended to close gaps in the existing stock of data. PASS has three main characteristics that extend analysis potential beyond that of the Federal Employment Agency's administrative data: 1. The panel takes the household context into account - including the situation before and after receipt of Unemployment Benefit II. 2. The panel is complete in that it covers all groups of persons and all employment biographies, not only people in dependent employment, unemployed people and those in need of assistance. The dataset also provides information on the status during phases of economic inactivity, self-employment or employment as civil servants. 3. The panel collects additional or significantly more detailed data on relevant characteristics such as attitudes, employment potential or job-search behaviour." (Author's abstract, IAB-Doku) The English version of this "Datenreport" can be found here: http://fdz.iab.de/187/section.aspx/Publikation/k100607a04 Additional information: Volume I of the Datenreport: Introduction and overview; Volume II: Codebook for the household dataset; Volume III: Codebook for the person dataset; Volume IV: Codebook for spell data, register data and weights; questionnaires of wave 2; further information on the panel "Labour Market and Social Security". Keywords: IAB household panel, data collection, survey method, sample, panel method, data preparation

    An artificial intelligence algorithm is highly accurate for detecting endoscopic features of eosinophilic esophagitis

    The endoscopic features associated with eosinophilic esophagitis (EoE) may be missed during routine endoscopy. We aimed to develop and evaluate an Artificial Intelligence (AI) algorithm for detecting and quantifying the endoscopic features of EoE in white light images, supplemented by the EoE Endoscopic Reference Score (EREFS). An AI algorithm (AI-EoE) was constructed and trained to differentiate between EoE and normal esophagus using endoscopic white light images extracted from the database of the University Hospital Augsburg. In addition to binary classification, a second algorithm was trained with specific auxiliary branches for each EREFS feature (AI-EoE-EREFS). The AI algorithms were evaluated on an external data set from the University of North Carolina, Chapel Hill (UNC), and compared with the performance of human endoscopists with varying levels of experience. The overall sensitivity, specificity, and accuracy of AI-EoE were 0.93 for all measures, while the AUC was 0.986. With additional auxiliary branches for the EREFS categories, the AI algorithm (AI-EoE-EREFS) performance improved to 0.96, 0.94, 0.95, and 0.992 for sensitivity, specificity, accuracy, and AUC, respectively. AI-EoE and AI-EoE-EREFS performed significantly better than endoscopy beginners and senior fellows on the same set of images. An AI algorithm can be trained to detect and quantify endoscopic features of EoE with excellent performance scores. The addition of the EREFS criteria improved the performance of the AI algorithm, which performed significantly better than endoscopists with a lower or medium experience level.

    The Validity of Data Fusion
